 long short-term memory


Long short-term memory and Learning-to-learn in networks of spiking neurons

Neural Information Processing Systems

Recurrent networks of spiking neurons (RSNNs) underlie the astounding computing and learning capabilities of the brain. But computing and learning capabilities of RSNN models have remained poor, at least in comparison with ANNs. We address two possible reasons for that. One is that RSNNs in the brain are not randomly connected or designed according to simple rules, and they do not start learning as a tabula rasa network. Rather, RSNNs in the brain were optimized for their tasks through evolution, development, and prior experience. Details of these optimization processes are largely unknown. But their functional contribution can be approximated through powerful optimization methods, such as backpropagation through time (BPTT). A second major mismatch between RSNNs in the brain and models is that the latter only show a small fraction of the dynamics of neurons and synapses in the brain. We include neurons in our RSNN model that reproduce one prominent dynamical process of biological neurons that takes place at the behaviourally relevant time scale of seconds: neuronal adaptation.
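
The abstract names neuronal adaptation but does not give the neuron equations. As a rough illustration only, the sketch below assumes one common way to model it: a leaky integrate-and-fire neuron whose firing threshold rises after each spike and decays slowly. The function names, the threshold rule, and every parameter value are hypothetical choices, not the authors' model.

```python
import math


def simulate_adaptive_lif(input_current, steps, dt=1.0, tau_m=20.0,
                          tau_a=200.0, base_threshold=1.0, beta=0.5):
    """Return spike times (step indices) of one leaky integrate-and-fire neuron.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    alpha = math.exp(-dt / tau_m)   # membrane decay per step
    rho = math.exp(-dt / tau_a)     # adaptation decay per step (much slower)
    v = 0.0                         # membrane potential
    a = 0.0                         # adaptation variable
    spikes = []
    for t in range(steps):
        v = alpha * v + (1.0 - alpha) * input_current
        a = rho * a
        if v >= base_threshold + beta * a:
            spikes.append(t)
            v = 0.0      # reset the membrane potential
            a += 1.0     # each spike raises the effective threshold
    return spikes


def intervals(spikes):
    """Gaps between consecutive spikes, in steps."""
    return [later - earlier for earlier, later in zip(spikes, spikes[1:])]


# Constant input: with adaptation (beta > 0) the gaps between spikes grow;
# with beta = 0 the neuron fires at a fixed rate.
adaptive = intervals(simulate_adaptive_lif(2.0, 300))
regular = intervals(simulate_adaptive_lif(2.0, 300, beta=0.0))
```

Because the adaptation variable decays far more slowly than the membrane potential, it carries information about past activity over a longer time scale, which is the property the abstract highlights.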


Quantum RNNs and LSTMs Through Entangling and Disentangling Power of Unitary Transformations

Daskin, Ammar

arXiv.org Artificial Intelligence

In this paper, we present a framework for modeling quantum recurrent neural networks (RNNs) and their enhanced version, long short-term memory (LSTM) networks using the core ideas presented by Linden et al. (2009), where the entangling and disentangling power of unitary transformations is investigated. In particular, we interpret entangling and disentangling power as information retention and forgetting mechanisms in LSTMs. Thus, entanglement emerges as a key component of the optimization (training) process. We believe that, by leveraging prior knowledge of the entangling power of unitaries, the proposed quantum-classical framework can guide the design of better-parameterized quantum circuits for various real-world applications.


QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Hsu, Yu-Chao, Jiang, Jiun-Cheng, Lin, Chun-Hua, Peng, Kuo-Chung, Chen, Nan-Yow, Chen, Samuel Yen-Chi, Kuo, En-Jui, Goan, Hsi-Sheng

arXiv.org Artificial Intelligence

Long short-term memory (LSTM) models are a particular type of recurrent neural network (RNN) central to sequential modeling tasks in domains such as urban telecommunication forecasting, where temporal correlations and nonlinear dependencies dominate. However, conventional LSTMs suffer from high parameter redundancy and limited nonlinear expressivity. In this work, we propose the Quantum-inspired Kolmogorov-Arnold Long Short-Term Memory (QKAN-LSTM), which integrates Data Re-Uploading Activation (DARUAN) modules into the gating structure of LSTMs. Each DARUAN acts as a quantum variational activation function (QVAF), enhancing frequency adaptability and enabling an exponentially enriched spectral representation without multi-qubit entanglement. The resulting architecture preserves quantum-level expressivity while remaining fully executable on classical hardware. Empirical evaluations on three datasets (Damped Simple Harmonic Motion, Bessel Function, and Urban Telecommunication) demonstrate that QKAN-LSTM achieves superior predictive accuracy and generalization with a 79% reduction in trainable parameters compared to classical LSTMs. We extend the framework to the Jiang-Huang-Chen-Goan Network (JHCG Net), which generalizes KAN to encoder-decoder structures, and then further use QKAN to realize the latent KAN, thereby creating a Hybrid QKAN (HQKAN) for hierarchical representation learning. The proposed HQKAN-LSTM thus provides a scalable and interpretable pathway toward quantum-inspired sequential modeling in real-world data environments.


Hybrid LSTM and PPO Networks for Dynamic Portfolio Optimization

Kevin, Jun, Yugopuspito, Pujianto

arXiv.org Artificial Intelligence

This paper introduces a hybrid framework for portfolio optimization that fuses Long Short-Term Memory (LSTM) forecasting with a Proximal Policy Optimization (PPO) reinforcement learning strategy. The proposed system leverages the predictive power of deep recurrent networks to capture temporal dependencies, while the PPO agent adaptively refines portfolio allocations in continuous action spaces, allowing the system to anticipate trends while adjusting dynamically to market shifts. Using multi-asset datasets covering U.S. and Indonesian equities, U.S. Treasuries, and major cryptocurrencies from January 2018 to December 2024, the model is benchmarked against equal-weight, index-style, and single-model baselines (LSTM-only and PPO-only) on annualized return, volatility, Sharpe ratio, and maximum drawdown, each adjusted for transaction costs. The results indicate that the hybrid architecture delivers higher returns and stronger resilience under non-stationary market regimes, suggesting its promise as a robust, AI-driven framework for dynamic portfolio optimization.
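
For readers unfamiliar with the evaluation metrics named above, here is a minimal sketch of how they are commonly computed from a series of periodic returns. The function names, the 252-period year, and the zero risk-free rate are assumptions for illustration; the abstract does not specify the paper's exact definitions, and transaction-cost adjustment is not modeled here.

```python
import math
import statistics


def annualized_return(returns, periods_per_year=252):
    """Geometric annualized return from simple per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(returns)) - 1.0


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free / periods_per_year for r in returns]
    return (math.sqrt(periods_per_year) * statistics.mean(excess)
            / statistics.stdev(excess))


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a fraction."""
    wealth = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst


# Made-up toy returns: a gain, a large loss, then a partial recovery.
toy_returns = [0.10, -0.50, 0.20]
toy_drawdown = max_drawdown(toy_returns)
```

Maximum drawdown is path-dependent: the partial recovery in the last period does not reduce it, because the measure records the worst loss from any earlier peak.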


Machine Learning vs. Randomness: Challenges in Predicting Binary Options Movements

Arantes, Gabriel M., Pinto, Richard F., Dalmazo, Bruno L., Borges, Eduardo N., Lucca, Giancarlo, de Mattos, Viviane L. D., Cardoso, Fabian C., Berri, Rafael A.

arXiv.org Artificial Intelligence

Binary options trading is often marketed as a field where predictive models can generate consistent profits. However, the inherent randomness and stochastic nature of binary options make price movements highly unpredictable, posing significant challenges for any forecasting approach. This study demonstrates that machine learning algorithms struggle to outperform a simple baseline in predicting binary options movements. Using a dataset of EUR/USD currency pairs from 2021 to 2023, we tested multiple models, including Random Forest, Logistic Regression, Gradient Boosting, and k-Nearest Neighbors (kNN), both before and after hyperparameter optimization. Furthermore, several neural network architectures, including Multi-Layer Perceptrons (MLP) and a Long Short-Term Memory (LSTM) network, were evaluated under different training conditions. Despite these exhaustive efforts, none of the models surpassed the ZeroR baseline accuracy, highlighting the inherent randomness of binary options. These findings reinforce the notion that binary options lack predictable patterns, making them unsuitable for machine learning-based forecasting.
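
The ZeroR baseline mentioned above always predicts the most frequent class seen in training. A minimal sketch follows; the function name and the toy labels are made up for illustration and have nothing to do with the study's EUR/USD data.

```python
from collections import Counter


def zero_r_accuracy(train_labels, test_labels):
    """Predict the majority training label for every test example.

    Returns the chosen label and the resulting test accuracy.
    """
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority_label)
    return majority_label, hits / len(test_labels)


# Hypothetical toy labels, not data from the paper.
train = ["up", "down", "up", "up", "down"]
test = ["up", "down", "down", "up"]
baseline_label, baseline_accuracy = zero_r_accuracy(train, test)
```

Any model whose test accuracy does not exceed `baseline_accuracy` has learned nothing beyond the class frequencies, which is the comparison the abstract reports.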


Bitcoin Price Forecasting Based on Hybrid Variational Mode Decomposition and Long Short Term Memory Network

Boadi, Emmanuel

arXiv.org Artificial Intelligence

This study proposes a hybrid deep learning model for forecasting the price of Bitcoin, a digital currency known for frequent fluctuations. The model combines Variational Mode Decomposition (VMD) with a Long Short-Term Memory (LSTM) network. First, VMD is used to decompose the original Bitcoin price series into Intrinsic Mode Functions (IMFs). Each IMF is then modeled using an LSTM network to capture temporal patterns more effectively. The individual forecasts from the IMFs are aggregated to produce the final prediction of the original Bitcoin price series. To determine the predictive power of the proposed hybrid model, a comparative analysis was conducted against the standard LSTM. The results confirmed that the hybrid VMD+LSTM model outperforms the standard LSTM across all the evaluation metrics, including RMSE, MAE, and R², and also provides a reliable 30-day forecast.
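
The decompose, forecast, aggregate pipeline described above can be sketched structurally without VMD or an LSTM. In the sketch below both are replaced by deliberately simple stand-ins, which is an assumption made purely to keep the example self-contained: a trailing moving average plus residual takes the place of VMD, and a last-value forecast takes the place of each per-component LSTM. All names are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average; a stand-in for one smooth component."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def decompose(series, window=3):
    """Split a series into additive components (stand-in for VMD)."""
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return [trend, residual]


def forecast_component(component):
    """One-step forecast of a component (stand-in for a trained LSTM)."""
    return component[-1]


def hybrid_forecast(series, window=3):
    """Forecast each component separately, then sum the forecasts."""
    components = decompose(series, window)
    return sum(forecast_component(c) for c in components)


# Made-up toy series, not Bitcoin data.
prices = [1.0, 2.0, 3.0, 4.0]
components = decompose(prices)
next_value = hybrid_forecast(prices)
```

The property this relies on is that the components sum back to the original series, so summing per-component forecasts gives a forecast on the original scale. With the naive stand-ins used here the result simply equals the last observed value; any gain in the actual method would come from the real decomposition and the learned per-component models.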


Benchmarking Quantum and Classical Sequential Models for Urban Telecommunication Forecasting

Chen, Chi-Sheng, Chen, Samuel Yen-Chi, Tsai, Yun-Cheng

arXiv.org Artificial Intelligence

In this study, we evaluate the performance of classical and quantum-inspired sequential models in forecasting univariate time series of incoming SMS activity (SMS-in) using the Milan Telecommunication Activity Dataset. Due to data completeness limitations, we focus exclusively on the SMS-in signal for each spatial grid cell. We compare five models, LSTM (baseline), Quantum LSTM (QLSTM), Quantum Adaptive Self-Attention (QASA), Quantum Receptance Weighted Key-Value (QRWKV), and Quantum Fast Weight Programmers (QFWP), under varying input sequence lengths (4, 8, 12, 16, 32 and 64). All models are trained to predict the next 10-minute SMS-in value based solely on historical values within a given sequence window. Our findings indicate that different models exhibit varying sensitivities to sequence length, suggesting that quantum enhancements are not universally advantageous. Rather, the effectiveness of quantum modules is highly dependent on the specific task and architectural design, reflecting inherent trade-offs among model size, parameterization strategies, and temporal modeling capabilities.
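
The setup described above, predicting the next value from a fixed-length window of history, amounts to slicing the series into input/target pairs. A minimal sketch, with a hypothetical function name and a made-up toy series rather than the Milan data:

```python
def make_windows(series, seq_len):
    """Build (input window, next value) pairs for one-step-ahead forecasting."""
    inputs = []
    targets = []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets


# Toy series; the study uses sequence lengths 4, 8, 12, 16, 32 and 64.
toy_series = [1, 2, 3, 4, 5, 6]
windows, next_values = make_windows(toy_series, seq_len=4)
```

A longer `seq_len` gives each example more context but yields fewer training pairs from the same series, one of the trade-offs behind the sensitivity to sequence length that the abstract reports.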


Machine Generalize Learning in Agent-Based Models: Going Beyond Surrogate Models for Calibration in ABMs

Najafzadehkhoei, Sima, Yon, George Vega, Modenesi, Bernardo, Meyer, Derek S.

arXiv.org Artificial Intelligence

Calibrating agent-based epidemic models is computationally demanding. We present a supervised machine learning calibrator that learns the inverse mapping from epidemic time series to SIR parameters. A three-layer bidirectional LSTM ingests 60-day incidence together with population size and recovery rate, and outputs transmission probability, contact rate, and R0. Training uses a composite loss with an epidemiology-motivated consistency penalty that encourages R0 × recovery rate to equal transmission probability × contact rate. In a 1000-scenario simulation study, we compare the calibrator with Approximate Bayesian Computation (likelihood-free MCMC). The method achieves lower error across all targets (MAE: R0 0.0616 vs 0.275; transmission 0.0715 vs 0.128; contact 1.02 vs 4.24), produces tighter predictive intervals with near-nominal coverage, and reduces wall-clock time from 77.4 s to 2.35 s per calibration. Although contact rate and transmission probability are partially nonidentifiable, the approach reproduces epidemic curves more faithfully than ABC, enabling fast and practical calibration. We evaluate it on SIR agent-based epidemics generated with epiworldR and provide an implementation in R.
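
The consistency penalty can be illustrated with a small sketch. The abstract states only the relation being encouraged, so the squared residual, the mean-absolute-error data term, the weight of 0.1, and all names below are assumptions rather than the authors' loss. The sketch is in Python for illustration, although the paper's implementation is in R.

```python
def consistency_penalty(r0, recovery_rate, transmission_prob, contact_rate):
    """Squared gap between R0 * recovery rate and transmission * contact.

    The squared form is an assumption; the abstract gives only the relation.
    """
    residual = r0 * recovery_rate - transmission_prob * contact_rate
    return residual ** 2


def composite_loss(predicted, target, recovery_rate, weight=0.1):
    """Data-fit term plus weighted consistency penalty.

    predicted and target are (transmission_prob, contact_rate, r0) tuples.
    The mean absolute error term and the weight are illustrative assumptions.
    """
    fit = sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)
    transmission_prob, contact_rate, r0 = predicted
    penalty = consistency_penalty(r0, recovery_rate,
                                  transmission_prob, contact_rate)
    return fit + weight * penalty


# Made-up values chosen so that 2.0 * 0.1 equals 0.05 * 4.0.
consistent = (0.05, 4.0, 2.0)
loss_consistent = composite_loss(consistent, consistent, recovery_rate=0.1)

# Same transmission and contact, but an R0 that violates the relation.
inconsistent = (0.05, 4.0, 3.0)
loss_inconsistent = composite_loss(inconsistent, consistent, recovery_rate=0.1)
```

The penalty ties the three outputs together: since the relation constrains only the product of transmission probability and contact rate, it does not by itself resolve the partial nonidentifiability the abstract mentions.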


Detecting and measuring respiratory events in horses during exercise with a microphone: deep learning vs. standard signal processing

Parmentier, Jeanne I. M., Aarts, Rhana M., Hernlund, Elin, Rhodin, Marie, van der Zwaag, Berend Jan

arXiv.org Artificial Intelligence

Monitoring respiration parameters such as respiratory rate could be beneficial to understand the impact of training on equine health and performance and ultimately improve equine welfare. In this work, we compare deep learning-based methods to an adapted signal processing method to automatically detect cyclic respiratory events and extract the dynamic respiratory rate from microphone recordings during high intensity exercise in Standardbred trotters. Our deep learning models are able to detect exhalation sounds (median F1 score of 0.94) in noisy microphone signals and show promising results on unlabelled signals at lower exercising intensity, where the exhalation sounds are less recognisable. Temporal convolutional networks were better at detecting exhalation events and estimating dynamic respiratory rates (median F1: 0.94, Mean Absolute Error (MAE) ± Confidence Intervals (CI): 1.44 ± 1.04 bpm, Limits Of Agreements (LOA): 0.63 ± 7.06 bpm) than long short-term memory networks (median F1: 0.90, MAE ± CI: 3.11 ± 1.58 bpm) and signal processing methods (MAE ± CI: 2.36 ± 1.11 bpm). This work is the first to automatically detect equine respiratory sounds and automatically compute dynamic respiratory rates in exercising horses. In the future, our models will be validated on lower exercising intensity sounds and different microphone placements will be evaluated in order to find the best combination for regular monitoring.